 failure scenario


Querying Labeled Time Series Data with Scenario Programs

Kim, Edward, Shanker, Devan, Bharadwaj, Varun, Park, Hongbeen, Kim, Jinkyu, Torfah, Hazem, Fremont, Daniel J, Seshia, Sanjit A

arXiv.org Artificial Intelligence

Simulation-based testing has become a crucial complement to road testing for ensuring the safety of cyber-physical systems (CPS). As a result, significant research efforts have been directed toward identifying failure scenarios within simulation environments. However, a critical question remains. Are the AV failure scenarios discovered in simulation reproducible on actual systems in the real world? The sim-to-real gap caused by differences between simulated and real sensor data means that failure scenarios identified in simulation might either be artifacts of synthetic sensor data or actual issues that also occur with real sensor data. To address this, an effective approach to validating simulated failure scenarios is to locate occurrences of these scenarios within real-world datasets and verify whether the failure persists on the datasets. To this end, we introduce a formal definition of how labeled time series sensor data can match an abstract scenario, represented as a scenario program using the Scenic probabilistic programming language. We present a querying algorithm that, given a scenario program and a labeled dataset, identifies the subset of data that matches the specified scenario. Our experiment shows that our algorithm is more accurate and orders of magnitude faster in querying scenarios than the state-of-the-art commercial vision large language models, and can scale with the duration of queried time series data.


FailSafe: Reasoning and Recovery from Failures in Vision-Language-Action Models

Lin, Zijun, Duan, Jiafei, Fang, Haoquan, Fox, Dieter, Krishna, Ranjay, Tan, Cheston, Wen, Bihan

arXiv.org Artificial Intelligence

Recent advances in robotic manipulation have integrated low-level robotic control into Vision-Language Models (VLMs), extending them into Vision-Language-Action (VLA) models. Although state-of-the-art VLAs achieve strong performance in downstream robotic applications, supported by large-scale crowd-sourced robot training data, they still inevitably encounter failures during execution. Enabling robots to reason and recover from unpredictable and abrupt failures remains a critical challenge. Existing robotic manipulation datasets, collected in either simulation or the real world, primarily provide only ground-truth trajectories, leaving robots unable to recover once failures occur. Moreover, the few datasets that address failure detection typically offer only textual explanations, which are difficult to utilize directly in VLA models. To address this gap, we introduce FailSafe, a novel failure generation and recovery system that automatically produces diverse failure cases paired with executable recovery actions. FailSafe can be seamlessly applied to any manipulation task in any simulator, enabling scalable creation of failure action data. To demonstrate its effectiveness, we fine-tune LLaVa-OneVision-7B (LLaVa-OV-7B) to build FailSafe-VLM. Experimental results show that FailSafe-VLM successfully helps robotic arms detect and recover from potential failures, improving the performance of three state-of-the-art VLA models (pi0-FAST, OpenVLA, OpenVLA-OFT) by up to 22.6% on average across several tasks in Maniskill. Furthermore, FailSafe-VLM could generalize across different spatial configurations, camera viewpoints, object and robotic embodiments. We plan to release the FailSafe code to the community.


ATLAS: AI-Native Receiver Test-and-Measurement by Leveraging AI-Guided Search

Belgiovine, Mauro, Pradhan, Suyash, Lange, Johannes, Löhning, Michael, Chowdhury, Kaushik

arXiv.org Artificial Intelligence

Industry adoption of Artificial Intelligence (AI)-native wireless receivers, or even modular, Machine Learning (ML)-aided wireless signal processing blocks, has been slow. The main concern is the lack of explainability of these trained ML models and the significant risks posed to network functionalities in case of failures, especially since (i) testing on every exhaustive case is infeasible and (ii) the data used for model training may not be available. This paper proposes ATLAS, an AI-guided approach that generates a battery of tests for pre-trained AI-native receiver models and benchmarks the performance against a classical receiver architecture. Using gradient-based optimization, it avoids spanning the exhaustive set of all environment and channel conditions; instead, it generates the next test in an online manner to further probe specific configurations that offer the highest risk of failure. We implement and validate our approach by adopting the well-known DeepRx AI-native receiver model as well as a classical receiver using differentiable tensors in NVIDIA's Sionna environment. ATLAS uncovers specific combinations of mobility, channel delay spread, and noise, where fully and partially trained variants of AI-native DeepRx perform suboptimally compared to the classical receivers. Our proposed method reduces the number of tests required per failure found by 19% compared to grid search for a 3-parameter input optimization problem, demonstrating greater efficiency.
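The idea of generating the next test from the gradient of a failure measure, rather than sweeping a grid, can be illustrated with a toy sketch. Everything here is an assumption made for illustration: `toy_gap` is a made-up smooth stand-in for "how much worse the AI-native receiver is than the classical one" over three normalized parameters, and the gradient is estimated by finite differences instead of the differentiable tensors the abstract mentions. It is not the ATLAS implementation.

```python
def toy_gap(x):
    """Hypothetical performance gap; peaks at an arbitrary point (0.8, 0.3, 0.6)."""
    target = (0.8, 0.3, 0.6)
    return -sum((xi - ti) ** 2 for xi, ti in zip(x, target))


def numerical_gradient(f, x, eps=1e-5):
    """Central finite-difference estimate of the gradient of f at x."""
    grad = []
    for i in range(len(x)):
        hi, lo = list(x), list(x)
        hi[i] += eps
        lo[i] -= eps
        grad.append((f(hi) - f(lo)) / (2 * eps))
    return grad


def next_test(f, x, step=0.1, bounds=(0.0, 1.0)):
    """Move the test configuration uphill on f, clipped to the parameter range."""
    grad = numerical_gradient(f, x)
    return [min(bounds[1], max(bounds[0], xi + step * gi))
            for xi, gi in zip(x, grad)]


def search(f, x0, n_tests=50):
    """Generate tests one at a time, each chosen from the previous result."""
    history = [list(x0)]
    for _ in range(n_tests):
        history.append(next_test(f, history[-1]))
    return history


history = search(toy_gap, [0.5, 0.5, 0.5])
print(history[-1])
```

A grid search over the same three parameters would evaluate every cell regardless of what earlier tests revealed; the sketch instead spends each new test near the configuration that currently looks riskiest.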


Learning-Based Passive Fault-Tolerant Control of a Quadrotor with Rotor Failure

Chen, Jiehao, Zhao, Kaidong, Liu, Zihan, Li, YanJie, Lou, Yunjiang

arXiv.org Artificial Intelligence

This paper proposes a learning-based passive fault-tolerant control (PFTC) method for quadrotors capable of handling arbitrary single-rotor failures, including conditions ranging from fault-free to complete rotor failure, without requiring any rotor fault information or controller switching. Unlike existing methods that treat rotor faults as disturbances and rely on a single controller for multiple fault scenarios, our approach introduces a novel Selector-Controller network structure. This architecture integrates a fault detection module and the controller into a unified policy network, effectively combining the adaptability to multiple fault scenarios of PFTC with the superior control performance of active fault-tolerant control (AFTC). To optimize performance, the policy network is trained using a hybrid framework that synergizes reinforcement learning (RL), behavior cloning (BC), and supervised learning with fault information. Extensive simulations and real-world experiments validate the proposed method, demonstrating significant improvements in fault response speed and position tracking performance compared to state-of-the-art PFTC and AFTC approaches. As drones are increasingly applied across various industries, safety concerns have garnered significant attention. Among these concerns, rotor failures are particularly critical, often leading to the immediate crash of the drone.


Hierarchical Fallback Architecture for High Risk Online Machine Learning Inference

Polleti, Gustavo, Santana, Marlesson, Del Sant, Felipe Sassi, Fontes, Eduardo

arXiv.org Artificial Intelligence

Open Banking powered machine learning applications require novel robustness approaches to deal with challenging stress and failure scenarios. In this paper we propose an hierarchical fallback architecture for improving robustness in high risk machine learning applications with a focus in the financial domain. We define generic failure scenarios often found in online inference that depend on external data providers and we describe in detail how to apply the hierarchical fallback architecture to address them. Finally, we offer a real world example of its applicability in the industry for near-real time transactional fraud risk evaluation using Open Banking data and under extreme stress scenarios.

These systems can fail unexpectedly in a variety of different ways. Notably, applications that rely on online inference are subject to their inability to keep up with the expected operating procedures while, now additionally, having to make tedious computational tasks for these AI/ML applications, typically resulting in timeouts, infrastructure outages and, often, failures in external dependencies such as third party data providers (external API calls) [7]. When the underlying machine learning applications are presented with strong robustness requirements, fallback or fall-over strategies are needed to keep operations running, even in the event of unexpected failures. In finance, specifically applications that require real time risk mitigation [...]
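A fallback strategy of this kind can be sketched as an ordered chain of predictors, where each tier is tried only if the one above it fails. The sketch below is a generic illustration under stated assumptions, not the architecture from the paper: the tier names (`full_model`, `reduced_model`, `static_rule`), the request fields, and the scoring rules are all hypothetical.

```python
def full_model(request):
    """Hypothetical top tier: needs features from an external data provider."""
    if "provider_features" not in request:
        # Stand-in for a timeout or outage in the external API call.
        raise TimeoutError("external data provider did not respond")
    return request["provider_features"]["risk"]


def reduced_model(request):
    """Hypothetical middle tier: uses only data available in the request."""
    return min(1.0, request["amount"] / 10_000)


def static_rule(request):
    """Hypothetical last resort: a conservative constant risk score."""
    return 1.0


# Tiers are ordered from most to least capable (an assumption of this sketch).
TIERS = [
    ("full_model", full_model),
    ("reduced_model", reduced_model),
    ("static_rule", static_rule),
]


def predict_with_fallback(tiers, request):
    """Return (tier_name, score) from the first tier that does not raise."""
    errors = []
    for name, predictor in tiers:
        try:
            return name, predictor(request)
        except Exception as exc:  # any failure triggers the next tier
            errors.append(f"{name}: {exc!r}")
    raise RuntimeError("all tiers failed: " + "; ".join(errors))


print(predict_with_fallback(TIERS, {"amount": 2500}))
```

Returning the tier name alongside the score lets the caller log how often the system is running in a degraded mode, which is useful when stress scenarios push many requests down the chain.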


Rescheduling after vehicle failures in the multi-depot rural postman problem with rechargeable and reusable vehicles

Sathyamurthy, Eashwar, Herrmann, Jeffrey W., Azarm, Shapour

arXiv.org Artificial Intelligence

We present a centralized auction algorithm to solve the Multi-Depot Rural Postman Problem with Rechargeable and Reusable Vehicles (MD-RPP-RRV), focusing on rescheduling arc routing after vehicle failures. The problem involves finding heuristically obtained best feasible routes for multiple rechargeable and reusable vehicles with capacity constraints capable of performing multiple trips from multiple depots, with the possibility of vehicle failures. Our algorithm auctions the failed trips to active (non-failed) vehicles through local auctioning, modifying initial routes to handle dynamic vehicle failures efficiently. When a failure occurs, the algorithm searches for the best active vehicle to perform the failed trip and inserts the trip into that vehicle's route, which avoids a complete rescheduling and reduces the computational effort. We compare the algorithm's solutions against offline optimal solutions obtained from solving a Mixed Integer Linear Programming (MILP) formulation using the Gurobi solver; this formulation assumes that perfect information about the vehicle failures and failure times is given. The results demonstrate that the centralized auction algorithm produces solutions that are, in some cases, near optimal; moreover, the execution time for the proposed approach is much more consistent and is, for some instances, orders of magnitude less than the execution time of the Gurobi solver. The theoretical analysis provides an upper bound for the competitive ratio and computational complexity of our algorithm, offering a formal performance guarantee in dynamic failure scenarios.
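The reassignment step described above (find the best active vehicle for a failed trip and insert the trip into its route) can be sketched as a simple auction. This is a heavily simplified illustration, not the authors' algorithm: it ignores depots, recharging, and capacity, models a route as a list of `(trip_id, duration)` pairs, appends the trip at the end of the route, and uses the resulting finish time as the bid. The `Vehicle` class, `reassign_failed_trip`, and `max_time` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class Vehicle:
    name: str
    trips: list = field(default_factory=list)  # (trip_id, duration) pairs
    failed: bool = False

    def finish_time(self):
        """Total time needed to complete all currently assigned trips."""
        return sum(duration for _, duration in self.trips)


def reassign_failed_trip(vehicles, trip, max_time):
    """Auction one failed trip among active vehicles.

    Each active vehicle bids the finish time it would have after taking
    the trip; the lowest feasible bid wins. Returns the winner, or None
    if no active vehicle can finish within max_time.
    """
    best, best_bid = None, None
    for vehicle in vehicles:
        if vehicle.failed:
            continue  # failed vehicles do not bid
        bid = vehicle.finish_time() + trip[1]
        if bid > max_time:
            continue  # infeasible bid
        if best_bid is None or bid < best_bid:
            best, best_bid = vehicle, bid
    if best is not None:
        best.trips.append(trip)  # only the winner's route changes
    return best


fleet = [
    Vehicle("a", [("t1", 5.0)]),
    Vehicle("b", [("t2", 2.0)]),
    Vehicle("c", [("t3", 3.0)], failed=True),
]
winner = reassign_failed_trip(fleet, ("t3", 3.0), max_time=10.0)
print(winner.name if winner else None)
```

Because only the winning vehicle's route is modified, the other routes stay untouched, which is what lets this style of repair avoid a complete rescheduling.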


Transformer-Based Fault-Tolerant Control for Fixed-Wing UAVs Using Knowledge Distillation and In-Context Adaptation

Giral, Francisco, Gómez, Ignacio, Vinuesa, Ricardo, Le-Clainche, Soledad

arXiv.org Artificial Intelligence

This study presents a transformer-based approach for fault-tolerant control in fixed-wing Unmanned Aerial Vehicles (UAVs), designed to adapt in real time to dynamic changes caused by structural damage or actuator failures. Employing a teacher-student knowledge distillation framework, the proposed approach trains a student agent with partial observations by transferring knowledge from a privileged expert agent with full observability, enabling robust performance across diverse failure scenarios.

In recent years, Unmanned Aerial Vehicles (UAVs) have been widely used to perform various applications in complex and critical scenarios, such as search and rescue or autonomous medical transportation. [...] and reliability of these aerial robots have become major concerns due to the potential implications of system failures. Unlike other robotics fields, such as manipulation and humanoid locomotion, where advanced control methods are essential for managing complex joint movements, UAV Flight Control Systems (FCSs) in industry typically rely [...]

However, complex environments and demanding tasks can cause structural damage to the UAV, altering its aerodynamic [...] Fixed-wing UAVs, in particular, exhibit highly complex, nonlinear dynamics, which can [...] Although current FCSs are robust, they struggle to maintain performance when the vehicle dynamics deviate from the original design specifications, sometimes leading to control divergence and catastrophic failure.


A novel framework for adaptive stress testing of autonomous vehicles in multi-lane roads

Trinh, Linh, Luu, Quang-Hung, Nguyen, Thai M., Vu, Hai L.

arXiv.org Artificial Intelligence

Stress testing is an approach for evaluating the reliability of systems under extreme conditions, which helps reveal vulnerable scenarios that standard testing may overlook. Identifying such scenarios is of great importance in autonomous vehicles (AVs) and other safety-critical systems. Since failure events are rare, naive random search approaches require a large number of vehicle operation hours to identify potential system failures. Adaptive Stress Testing (AST) addresses this constraint by exploring the failure trajectories of the AV using a Markov decision process and employing reinforcement learning techniques to identify driving scenarios with a high probability of failure. However, existing AST frameworks can handle only simple scenarios, such as one vehicle moving longitudinally on a single-lane road, which is unrealistic and has limited applicability. In this paper, we propose a novel AST framework to systematically explore corner cases of intelligent driving models that can result in safety concerns involving both longitudinal and lateral vehicle movements. Specifically, we develop a new reward function for Deep Reinforcement Learning to guide the AST in identifying crash scenarios based on the collision probability estimate between the AV under test (i.e., the ego vehicle) and the trajectory of other vehicles on multi-lane roads. To demonstrate the effectiveness of our framework, we tested it with a complex driving model vehicle that can be controlled in both longitudinal and lateral directions. Quantitative and qualitative analyses of our experimental results demonstrate that our framework outperforms the state-of-the-art AST scheme in identifying corner cases with complex driving maneuvers.


Algorithmic Scenario Generation as Quality Diversity Optimization

Nikolaidis, Stefanos

arXiv.org Artificial Intelligence

The increasing complexity of robots and autonomous agents that interact with people highlights the critical need for approaches that systematically test them before deployment. This review paper presents a general framework for solving this problem, describes the insights that we have gained from working on each component of the framework, and shows how integrating these components leads to the discovery of a diverse range of realistic and challenging scenarios that reveal previously unknown failures in deployed robotic systems interacting with people.


Acceleration method for generating perception failure scenarios based on editing Markov process

Cai, Canjie

arXiv.org Artificial Intelligence

With the rapid advancement of autonomous driving technology, self-driving cars have become a central focus in the development of future transportation systems. Scenario generation technology has emerged as a crucial tool for testing and verifying the safety performance of autonomous driving systems. Current research in scenario generation primarily focuses on open roads such as highways, with relatively limited studies on underground parking garages. The unique structural constraints, insufficient lighting, and high-density obstacles in underground parking garages impose greater demands on the perception systems, which are critical to autonomous driving technology. This study proposes an accelerated generation method for perception failure scenarios tailored to the underground parking garage environment, aimed at testing and improving the safety performance of autonomous vehicle (AV) perception algorithms in such settings. The method presented in this paper generates an intelligent testing environment with a high density of perception failure scenarios by learning the interactions between background vehicles (BVs) and autonomous vehicles (AVs) within perception failure scenarios. Furthermore, this method edits the Markov process within the perception failure scenario data to increase the density of critical information in the training data, thereby optimizing the learning and generation of perception failure scenarios. A simulation environment for an underground parking garage was developed using the Carla and Vissim platforms, with Bevfusion employed as the perception algorithm for testing. The study demonstrates that this method can generate an intelligent testing environment with a high density of perception failure scenarios and enhance the safety performance of perception algorithms within this experimental setup.